Crate icu_properties

Definitions of Unicode Properties and APIs for retrieving property data in an appropriate data structure.

This module is published as its own crate (icu_properties) and as part of the icu crate. See the latter for more details on the ICU4X project.

APIs that return a CodePointSetData exist for binary properties and certain enumerated properties. See the sets module for more details.

APIs that return a CodePointMapData exist for certain enumerated properties. See the maps module for more details.

Examples

Property data as CodePointSetDatas

use icu::properties::{maps, sets, GeneralCategory};

// A binary property as a `CodePointSetData`

assert!(sets::emoji().contains('🎃')); // U+1F383 JACK-O-LANTERN
assert!(!sets::emoji().contains('木')); // U+6728

// An individual enumerated property value as a `CodePointSetData`

let line_sep_data = maps::general_category()
    .get_set_for_value(GeneralCategory::LineSeparator);
let line_sep = line_sep_data.as_borrowed();

assert!(line_sep.contains32(0x2028));
assert!(!line_sep.contains32(0x2029));
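
To make the membership checks above more concrete, here is a minimal, std-only sketch of how a code point set can be modeled as sorted ranges queried by binary search. The `RangeSet` type and the `line_separator_set` helper are hypothetical names invented for this illustration; they are not part of icu_properties, and the crate's real storage is not reproduced here.

```rust
use std::cmp::Ordering;

/// Hypothetical stand-in for a code point set (NOT the ICU4X type).
struct RangeSet {
    /// Sorted, non-overlapping, inclusive ranges of code points.
    ranges: Vec<(u32, u32)>,
}

impl RangeSet {
    /// Returns true if `cp` falls inside any stored range.
    fn contains32(&self, cp: u32) -> bool {
        self.ranges
            .binary_search_by(|&(start, end)| {
                if cp < start {
                    // This range lies entirely above the code point.
                    Ordering::Greater
                } else if cp > end {
                    // This range lies entirely below the code point.
                    Ordering::Less
                } else {
                    Ordering::Equal
                }
            })
            .is_ok()
    }

    /// Convenience wrapper for `char` inputs.
    fn contains(&self, ch: char) -> bool {
        self.contains32(ch as u32)
    }
}

/// Illustrative data: U+2028 is the only code point whose
/// General_Category is Line_Separator.
fn line_separator_set() -> RangeSet {
    RangeSet {
        ranges: vec![(0x2028, 0x2028)],
    }
}
```

With this sketch, `line_separator_set().contains32(0x2028)` is true and `contains32(0x2029)` is false, mirroring the assertions above. In real code, prefer the crate's own `CodePointSetData` APIs.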

Property data as CodePointMapDatas

use icu::properties::{maps, Script};

assert_eq!(maps::script().get('🎃'), Script::Common); // U+1F383 JACK-O-LANTERN
assert_eq!(maps::script().get('木'), Script::Han); // U+6728
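
As a conceptual companion to the lookups above, the following std-only sketch models a code point map as sorted ranges that carry a value, with a default for everything else. `RangeMap`, `ToyScript`, and `toy_script_map` are hypothetical names for this illustration only; the table covers a single block and is far from complete, and it does not reflect how icu_properties actually stores its data.

```rust
use std::cmp::Ordering;

/// Toy subset of script values, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ToyScript {
    Common,
    Han,
}

/// Hypothetical stand-in for a code point map (NOT the ICU4X type).
struct RangeMap<T> {
    /// Sorted, non-overlapping, inclusive ranges with their value.
    ranges: Vec<(u32, u32, T)>,
    /// Value returned for code points outside every range.
    default: T,
}

impl<T: Copy> RangeMap<T> {
    /// Looks up the value for a code point given as `u32`.
    fn get32(&self, cp: u32) -> T {
        let found = self.ranges.binary_search_by(|&(start, end, _)| {
            if cp < start {
                Ordering::Greater
            } else if cp > end {
                Ordering::Less
            } else {
                Ordering::Equal
            }
        });
        match found {
            Ok(index) => self.ranges[index].2,
            Err(_) => self.default,
        }
    }

    /// Convenience wrapper for `char` inputs.
    fn get(&self, ch: char) -> T {
        self.get32(ch as u32)
    }
}

/// Illustrative table: only the CJK Unified Ideographs block is listed,
/// and every other code point falls back to Common (an assumption of
/// this toy, not a statement about real Unicode data).
fn toy_script_map() -> RangeMap<ToyScript> {
    RangeMap {
        ranges: vec![(0x4E00, 0x9FFF, ToyScript::Han)],
        default: ToyScript::Common,
    }
}
```

Under these assumptions, `toy_script_map().get('木')` yields `ToyScript::Han` and `get('🎃')` yields `ToyScript::Common`, matching the shape of the real example. A map like this can also answer set-style questions by collecting every range that carries one value, which is the idea behind deriving a set for a single enumerated value.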

Re-exports

Modules

  • This module exposes tooling for running the Unicode Bidirectional Algorithm using ICU4X data.
  • Data and APIs for supporting specific Bidi property data in an efficient structure.
  • This module provides APIs for getting exemplar characters for a locale.
  • The functions in this module return a CodePointMapData representing, for each code point in the entire range of code points, the property values for a particular Unicode property.
  • Module for working with the names of property values.
  • 🚧 [Unstable] Data provider struct definitions for this ICU4X component.
  • Data and APIs for supporting both Script and Script_Extensions property values in an efficient structure.
  • The functions in this module return a CodePointSetData containing the set of characters with a particular Unicode property.

Structs

Enums